feat(results): git-native storage — design doc + implementation#1261

Draft
christso wants to merge 18 commits into main from feat/git-native-results



@christso
Collaborator

⚠️ Draft for handoff — currently only contains the design doc. Implementation pending.

Supersedes closed PR #1260 (P1 append-only index approach). Implements issue #1259 with the design-pivoted architecture.

Summary

Replaces the per-manifest read bottleneck with a git-native model:

  • Git is the canonical store; local clone is the working copy
  • git ls-tree + git cat-file --batch for listing (no checkout, no separate index file)
  • Eval writes directly to local clone (no more `.agentv/results/runs/`)
  • Cursor pagination
  • `mode: github` explicit in config (extension point mirroring skillfully's pattern)

Why a design pivot

P1 (PR #1260) introduced `index/runs.jsonl` as an append-only index. It worked but added a second source of truth that:

  • Could drift from actual repo contents
  • Grew forever with no pruning
  • Required a sha-amend dance during push
  • Required a `reindex` migration command

The cleaner approach: the git tree IS the index. `benchmark.json` already exists per run with all listing metadata. `git ls-tree` enumerates runs without reading any blob. `git cat-file --batch` reads existing `benchmark.json` blobs in one subprocess call. No new file, no drift, naturally prunes.
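As a rough illustration of "the git tree IS the index", the sketch below parses the default output of `git ls-tree` (one `<mode> <type> <oid>\t<path>` line per entry) and keeps only the `benchmark.json` blobs. The names `TreeEntry`, `parseLsTree`, and `benchmarkBlobs` are hypothetical, and the assumption that each run is a directory containing `benchmark.json` is inferred from the paragraph above rather than taken from the implementation.

```typescript
// One entry of `git ls-tree` default output: "<mode> <type> <oid>\t<path>".
interface TreeEntry {
  mode: string;
  type: string; // "blob", "tree", or "commit" (submodule)
  oid: string;
  path: string;
}

// Parses the default (non -z) output of `git ls-tree`.
// Note: without -z, git quotes unusual path names; this sketch assumes
// plain ASCII paths and does not unquote them.
function parseLsTree(output: string): TreeEntry[] {
  const entries: TreeEntry[] = [];
  for (const line of output.split("\n")) {
    if (line === "") continue; // skip the trailing empty line
    const tab = line.indexOf("\t");
    if (tab === -1) throw new Error(`unexpected ls-tree line: ${line}`);
    const parts = line.slice(0, tab).split(" ");
    if (parts.length !== 3) throw new Error(`unexpected ls-tree line: ${line}`);
    entries.push({
      mode: parts[0],
      type: parts[1],
      oid: parts[2],
      path: line.slice(tab + 1),
    });
  }
  return entries;
}

// Assumed layout (hypothetical): <run-dir>/benchmark.json, one per run.
function benchmarkBlobs(entries: TreeEntry[]): TreeEntry[] {
  return entries.filter(
    (e) => e.type === "blob" && e.path.endsWith("/benchmark.json"),
  );
}
```

The object IDs returned by `benchmarkBlobs` are what would be fed to `git cat-file --batch` on stdin, so listing never needs a checkout.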

See `docs/plans/git-native-results.md` for the full design rationale and implementation breakdown.

Sub-task status under the new design

| Sub-task | Status |
| --- | --- |
| P1 (index file) | ❌ Obsolete |
| P2 (in-memory cache) | ⏸ Deferred |
| P3 (git object DB reads) | ✅ This PR |
| P4 (pagination) | ✅ This PR |
| P5 (zero-config same-repo) | ⏸ Deferred |
| P6 (commit trailer) | ✅ This PR |

Breaking changes accepted

  • `results.repo` becomes required
  • `results.path` repurposed from "subdir within remote repo" → "local clone filesystem path"
  • No more `.agentv/results/runs/` writes (project-local results removed)
  • `cache_dir` → `local_dir` in status responses

No production users yet, so breaking changes are acceptable. Documented in plan + release notes.
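To make the new config contract concrete, here is a minimal sketch of how the breaking changes could be enforced. It is not the PR's `normalizeResultsConfig`: the function name, the slug rule (replacing `/` with `-`), the "must be absolute or start with `~`" check, and the error wording are all assumptions. Only `mode: github`, the required `results.repo`, and the default `~/.agentv/results/<slug>/` location come from this PR.

```typescript
// Hypothetical input shape; only mode, repo, and path are named in this PR.
interface ResultsConfigInput {
  mode: string;
  repo: string; // e.g. "owner/name"
  path?: string; // local clone filesystem path (optional)
}

interface NormalizedResultsConfig {
  mode: "github";
  repo: string;
  path: string; // always an expanded filesystem path
}

// Assumption: the slug is the repo identifier with "/" replaced by "-".
function repoSlug(repo: string): string {
  return repo.replace(/\//g, "-");
}

// homeDir is passed in so the sketch stays pure and easy to test.
function normalizeResultsConfigSketch(
  input: ResultsConfigInput,
  homeDir: string,
): NormalizedResultsConfig {
  if (input.mode !== "github") {
    throw new Error('results.mode must be "github"');
  }
  if (!input.repo) {
    throw new Error("results.repo is required");
  }
  const raw = input.path ?? `~/.agentv/results/${repoSlug(input.repo)}`;
  // Assumed rule: a filesystem path is absolute or starts with "~".
  // Old-style subdir values such as "runs" fail this check.
  const isFsPath = raw.startsWith("/") || raw === "~" || raw.startsWith("~/");
  if (!isFsPath) {
    throw new Error(
      `results.path "${raw}" looks like a repo subdirectory; ` +
        "it is now the local clone path (absolute or starting with ~)",
    );
  }
  const path =
    raw === "~" ? homeDir : raw.startsWith("~/") ? `${homeDir}/${raw.slice(2)}` : raw;
  return { mode: "github", repo: input.repo, path };
}
```

Rejecting bare values like `runs` outright, rather than silently reinterpreting them, gives existing configs a clear migration error instead of a clone in an unexpected directory.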

Implementation passes

See plan for details. Recommended order:

  1. Config + path renames
  2. Write path (direct-to-clone, commit trailer)
  3. Read path (`git ls-tree` + `git cat-file --batch`)
  4. Pagination (cursor)
  5. Cleanup (remove P1 scope, update docs/examples)
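For the read path in step 3, the part most likely to need care is parsing `git cat-file --batch` output. For each requested object git prints a header line `<oid> <type> <size>`, then exactly `<size>` bytes of content, then a newline; for unknown objects it prints `<name> missing`. The sketch below handles that framing on raw bytes. `BatchObject` and `parseCatFileBatch` are hypothetical names, not the PR's batch parser.

```typescript
interface BatchObject {
  oid: string;
  type: string; // "blob" for benchmark.json
  content: Uint8Array; // raw bytes, exactly <size> long
}

// Parses the complete stdout of `git cat-file --batch`.
// <size> is a byte count, so slicing happens on bytes before any decoding;
// slicing a decoded string would break on multi-byte UTF-8 content.
function parseCatFileBatch(buf: Uint8Array): BatchObject[] {
  const decoder = new TextDecoder();
  const objects: BatchObject[] = [];
  let pos = 0;
  while (pos < buf.length) {
    const nl = buf.indexOf(0x0a, pos); // end of the header line
    if (nl === -1) throw new Error("truncated header");
    const header = decoder.decode(buf.subarray(pos, nl));
    pos = nl + 1;
    if (header.endsWith(" missing")) continue; // no body follows
    const parts = header.split(" ");
    if (parts.length !== 3) throw new Error(`unexpected header: ${header}`);
    const size = Number(parts[2]);
    if (!Number.isInteger(size) || size < 0) {
      throw new Error(`bad size in header: ${header}`);
    }
    if (pos + size > buf.length) throw new Error("truncated body");
    objects.push({
      oid: parts[0],
      type: parts[1],
      content: buf.subarray(pos, pos + size),
    });
    pos += size + 1; // skip the content and the LF git appends after it
  }
  return objects;
}
```

Each `content` can then be decoded and passed to `JSON.parse` to recover a run's listing metadata. A production version would more likely stream stdout than buffer it whole.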

Test plan

  • Unit tests for `git ls-tree` / `git cat-file --batch` parsing
  • Integration test with tmp git repo: write → list → assert
  • Pagination unit tests (cursor in/out of bounds)
  • E2E: real eval against test results repo; verify commit + trailer + Studio rendering
  • All 1782 core + 553 CLI existing tests pass
  • Lint / typecheck clean
  • Manual red/green UAT documented before marking ready
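For the pagination tests in particular, a small reference model may help pin down the in-bounds and out-of-bounds cases. The sketch assumes run IDs are unique strings sorted newest first (for example timestamp-based directory names) and that the cursor is the ID of the last item on the previous page; neither assumption is confirmed by the plan, and `paginateRuns` is a hypothetical name.

```typescript
interface Page {
  items: string[];
  nextCursor: string | null; // null when there is nothing further to fetch
}

// ids must be sorted in descending order (newest first).
// The cursor is compared by value rather than looked up by position, so a
// cursor that points at a run which has since been pruned still resumes
// at the next older run instead of failing.
function paginateRuns(ids: string[], limit: number, cursor?: string): Page {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  let start = 0;
  if (cursor !== undefined) {
    const idx = ids.findIndex((id) => id < cursor);
    start = idx === -1 ? ids.length : idx; // cursor is past the end: empty page
  }
  const items = ids.slice(start, start + limit);
  const hasMore = start + items.length < ids.length;
  const nextCursor = hasMore ? items[items.length - 1] : null;
  return { items, nextCursor };
}
```

Comparing by value fits the "naturally prunes" property of the git tree: a stale cursor degrades to a valid position rather than an error.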

Handoff notes for next agent

  • Worktree: `agentv.worktrees/git-native-results/` (this branch, from `origin/main` at af118c6)
  • Read the plan first — it's the source of truth for this PR's scope
  • Old worktree at `agentv.worktrees/p1-runs-index/` (closed PR #1260, "feat(results): append-only run index for fast Studio list view") can be removed once you confirm nothing's needed from it
  • AGENTS.md rules apply: red/green UAT is BLOCKING before marking ready for review

🤖 Generated with Claude Code

Captures the agreed architecture before implementation:
- Git is the canonical store; local clone is the working copy
- No separate index file — git tree IS the index
- Eval writes directly to clone working tree (not project-local .agentv/results/)
- Reads via git ls-tree + git cat-file --batch (no checkout)
- Pagination via cursor
- mode: github explicit in config (extension point)

Supersedes closed PR #1260. See docs/plans/git-native-results.md for full design.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cloudflare-workers-and-pages
cloudflare-workers-and-pages Bot commented May 21, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 8b9e112
Status: ✅  Deploy successful!
Preview URL: https://bce10e3c.agentv.pages.dev
Branch Preview URL: https://feat-git-native-results.agentv.pages.dev


christso and others added 17 commits May 21, 2026 19:36
- Add `mode: 'github'` as required field to ResultsConfig
- Repurpose `results.path` as optional local filesystem path for clone
  (default: ~/.agentv/results/<slug>/); reject old-style subdir values
  (e.g. 'runs') with a migration message
- Rename ResultsRepoCachePaths → ResultsRepoLocalPaths
- Rename getResultsRepoCachePaths → getResultsRepoLocalPaths
- Rename cache_dir → local_dir in ResultsRepoStatus wire format
- normalizeResultsConfig: fill default path, expand ~, include mode
- Remove redundant local normalizeResultsConfig copy in remote.ts
- Update config-validator.ts to enforce mode and filesystem-path rule
- Update tests for new schema

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix biome string-concat lint error (single template literal)
- resolveResultsRepoRunsDir: use normalized.path directly (new design)
- getResultsRepoStatus: check existsSync(normalized.path) for available,
  set local_dir to normalized.path
- serve.test.ts: update two tests to use mode:github schema and new
  default path layout (~/.agentv/results/<slug>/runs/...)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Added listGitRuns() using git ls-tree + cat-file --batch
- Improved batch parser
- Saved implementation goal document

This is early progress toward the full git-native results implementation.
More to come in follow-up commits.
- Enrich GitListedRun with display_name, test_count, avg_score, size_bytes
- Update remote.ts mapping to populate ResultFileMeta fields
- Read path now returns data Studio can render
- Add user: ${UID}:${GID} to docker-compose for mounted repo permissions
- Update goal document with current status
- Reinstall dependencies in worktree
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>